Search CORE

38 research outputs found

Single-molecule real-time sequencing combined with optical mapping yields completely finished fungal genome

Author: Datema Erwin
Faino Luigi
Janssen Antoine
Seidl Michael F.
Thomma Bart P. H. J.
Van Den Berg Grardy C. M.
Wittenberg Alexander H. J.
Publication venue: American Society for Microbiology
Publication date: 01/01/2015

Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes.
IMPORTANCE Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology.

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

High-throughput bioinformatics with the Cyrille2 pipeline system

Author: Datema Erwin
de Groot Joost CW
Fiers Mark WEJ
van der Burgt Ate
van Ham Roeland CHJ
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results: We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web-based graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system, tracks what data enters the system, and determines what jobs must be scheduled for execution; and 3) the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion: The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high-throughput, flexible bioinformatics pipelines.

Lirias

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comparative BAC end sequence analysis of tomato and potato reveals overrepresentation of specific gene families in potato

Author: Buels Robert
Datema Erwin
Giovannoni James J
Mueller Lukas A
Stiekema Willem J
van Ham Roeland CHJ
Visser Richard GF
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Tomato (Solanum lycopersicon) and potato (S. tuberosum) are two economically important crop species, the genomes of which are currently being sequenced. This study presents a first genome-wide analysis of these two species, based on two large collections of BAC end sequences representing approximately 19% of the tomato genome and 10% of the potato genome. Results: The tomato genome has a higher repeat content than the potato genome, primarily due to a higher number of retrotransposon insertions in the tomato genome. On the other hand, simple sequence repeats (SSRs) are more abundant in potato than in tomato. The two genomes also differ in the frequency distribution of SSR motifs. Based on EST and protein alignments, potato appears to contain up to 6,400 more putative coding regions than tomato. Major gene families such as cytochrome P450 mono-oxygenases and serine-threonine protein kinases are significantly overrepresented in potato, compared to tomato. Moreover, the P450 superfamily appears to have expanded spectacularly in both species compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Both tomato and potato appear to have a low level of microsynteny with A. thaliana. A higher degree of synteny was observed with Populus trichocarpa, specifically in the region between 15.2 and 19.4 Mb on P. trichocarpa chromosome 10. Conclusion: The findings in this paper present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. When the complete genome sequences of these species become available, whole-genome comparisons and protein- or repeat-family specific studies may shed more light on the observations made here.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

Population Genomic Analysis Reveals Differential Evolutionary Histories and Patterns of Diversity across Subgenomes and Subpopulations of Brassica napus L.

Author: Daniel C. Ilut
David Grant
Elodie Gazave
Erica E. Tassone
Erwin Datema
Hanneke M. A. Witsenboer
Jack Brown
James B. Davis
John M. Dyer
Matthew A. Jenks
Megan Wingerson
Michael A. Gore
Publication venue: Frontiers Media SA
Publication date: 01/01/2016

The allotetraploid species Brassica napus L. is a global crop of major economic importance, providing canola oil (seed) and vegetables for human consumption and fodder and meal for livestock feed. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to identify and genotype 30,881 SNPs in a diversity panel of 782 B. napus accessions, representing samples of winter and spring growth habits originating from 33 countries across Europe, Asia and America. We detected strong population structure broadly concordant with growth habit and geography, and identified three major genetic groups: spring (SP), winter Europe (WE), and winter Asia (WA). Subpopulation-specific polymorphism patterns suggest enriched genetic diversity within the WA group and a smaller effective breeding population for the SP group compared to WE. Interestingly, the two subgenomes of B. napus appear to have different geographic origins, with phylogenetic analysis placing WE and WA as basal clades for the other subpopulations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differed markedly from the genome-wide average, several of which are suggestive of genomic inversions. The results obtained in this study constitute a valuable resource for worldwide breeding efforts and the genetic dissection and prediction of complex B. napus traits.

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

The Research Repository @ WVU (West Virginia University)

De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology

Author: Bosman Lizanne
Daran Jean-Marc
Daran-Lapujade Pascale
Datema Erwin
de Kok Stefan
de Ridder Dick
Heijne Wilbert HM
Klaassen Paul
Kötter Peter
Luttik Marijke A
Nielsen Jens
Nijkamp Jurgen F
Paddon Chris J
Platt Darren
Pronk Jack T
Reinders Marcel JT
van den Broek Marcel
van Ham Roeland C
Vongsangnak Wanwipa
Publication venue: BioMed Central
Publication date: 01/01/2012

Saccharomyces cerevisiae CEN.PK113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNVs), insertions/deletions (indels) and differences in genome organization compared to the reference strain S. cerevisiae S288C were analyzed. In addition to a few large deletions and duplications, nearly 3000 indels were identified in the CEN.PK113-7D genome relative to S288C. These differences were overrepresented in genes whose functions are related to transcriptional regulation and chromatin remodelling. Some of these variations were caused by unstable tandem repeats, suggesting an innate evolvability of the corresponding genes. Besides a previously characterized mutation in adenylate cyclase, the CEN.PK113-7D genome sequence revealed a significant enrichment of non-synonymous mutations in genes encoding components of the cAMP signalling pathway. Some phenotypic characteristics of the CEN.PK113-7D strains were explained by the presence of additional specific metabolic genes relative to S288C. In particular, the presence of the BIO1 and BIO6 genes correlated with a biotin prototrophy of CEN.PK113-7D. Furthermore, the copy number, chromosomal location and sequences of the MAL loci were resolved. The assembled sequence reveals that CEN.PK113-7D has a mosaic genome that combines characteristics of laboratory strains and wild-industrial strains.

Crossref

TU Delft Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Wageningen University & Research Publications

Chalmers Research

Chalmers Publication Library

Hochschulschriftenserver - Universität Frankfurt am Main

The Genomes of the Fungal Plant Pathogens Cladosporium fulvum and Dothistroma septosporum Reveal Adaptation to Different Hosts and Lifestyles But Also Signatures of Common Ancestry.

We sequenced and compared the genomes of the Dothideomycete fungal plant pathogens Cladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse), which are closely related phylogenetically but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of gene content in both genomes are homologs), but differ significantly in size (Cfu >61.1-Mb; Dse 31.2-Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains an α-tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation.

HAL AMU

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Purdue E-Pubs

ProdInra

Computational pan-genomics: status, promises and challenges

Author: Abeel Thomas
Alkan Can
Baaijens Jasmijn
Bakker Paul
Boeva Valentina
Bonnal Raoul
Chiaromonte Francesca
Chikhi Rayan
Ciccarelli Francesca
Cijvat Robin
Datema Erwin
Dijkstra Louis
Duijn Cornelia
Dutilh Bas
Eichler Evan
El-Kebir Mohammed
Ernst Corinna
Eskin Eleazar
Garrison Erik
Ghaffaari Ali
Guryev Victor
Kersey Paul
Klau Gunnar
Kloosterman Wigard
Korbel Jan
Lameijer Eric-Wubbo
Langmead Benjamin
Marschall Tobias
Martin Marcel
Marz Manja
Medvedev Paul
Mu John
Mäkinen Veli
Neerincx Pieter
Novak Adam
Ouwens Klaasjan
Paten Benedict
Peterlongo Pierre
Pisanti Nadia
Porubsky David
Rahmann Sven
Raphael Benjamin
Reinert Knut
Ridder Dick
Ridder Jeroen
Rivals Eric
Sanders Ashley
Schlesner Matthias
Schulz-Trieglaff Ole
Schönhuth Alexander
Sheikhizadeh Siavash
Shneider Carl
Smit Sandra
The Computational Pan-Genomics Consortium
Valenzuela Daniel
Vandin Fabio
Wang Jiayin
Wessels Lodewyk
Ye Kai
Zhang Ying
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2018

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.

INRIA a CCSD electronic archive server

Archivio della Ricerca - Università di Pisa

EUR Research Repository

HAL-MINES ParisTech

Archivio della ricerca della Scuola Superiore Sant'Anna

Radboud Repository

HAL-Rennes 1

Computational pan-genomics: Status, promises and challenges

Author: Abeel T. (Thomas)
Alkan C. (Can)
Baaijens J.A. (Jasmijn)
Bakker P.I.W. (Paul) de
Boeva V. (Valentina)
Bonnal R.J.P. (Raoul)
Chiaromonte F. (Francesca)
Chikhi R. (Rayan)
Ciccarelli F.D. (Francesca)
Cijvat C.P. (Robin)
Datema E. (Erwin)
Dijkstra L.J. (Louis)
Duijn C.M. (Cornelia) van
Dutilh B.E. (Bas)
Eichler E.E. (Evan)
El-Kebir M. (Mohammed)
Ernst C. (Corinna)
Eskin E. (Eleazar)
Garrison E. (Erik)
Ghaffaari A. (Ali)
Guryev V. (Victor)
Kersey P. (Paul)
Klau G.W. (Gunnar)
Kloosterman W.P. (Wigard)
Korbel J.O. (Jan)
Lameijer E.-W. (Eric-Wubbo)
Langmead B. (Benjamin)
Marschall T. (Tobias)
Martin M. (Marcel)
Marz M. (Manja)
Medvedev P. (Paul)
Mu J.C. (John)
Mäkinen V. (Veli)
Neerincx P.B.T. (Pieter)
Novak A.M. (Adam)
Ouwens K. (Klaasjan)
Paten B. (Benedict)
Peterlongo P. (Pierre)
Pisanti N. (Nadia)
Porubsky D. (David)
Rahmann S. (Sven)
Raphael B.J. (Benjamin)
Reinert K. (Knut)
Ridder D. (Dick) de
Ridder J. (Jeroen) de
Rivals E. (Eric)
Sanders A.D. (Ashley)
Schlesner M. (Matthias)
Schulz-Trieglaff O. (Ole)
Schönhuth A. (Alexander)
Sheikhizadeh S. (Siavash)
Shneider C. (Carl)
Smit S. (Sandra)
The Computational Pan-Genomics Consortium
Valenzuela D. (Daniel)
Vandin F. (Fabio)
Wang J. (Jiayin)
Wessels L.F.A. (Lodewyk)
Ye K. (Kai)
Zhang Y. (Ying)
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2018

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations

CWI's Institutional Repository

Erasmus University Digital Repository

Homologues of potato chromosome 5 show variable collinearity in the euchromatin, but dramatic absence of sequence similarity in the pericentromeric heterochromatin

Author: A Ballvora
AM Anithakumari
AM Rickert
B Kloosterman
C Leonards-Schippers
C Navarro
Christian W B Bachem
CT Kelleher
D Hupalo
D Leister
D Schuermann
DM Spooner
DQ Ding
E Bakker
E Coen
EK Al-Dous
Erin H Bakker
Erwin Datema
F Rodriguez
G Benson
GA Torres
GA Tuskan
H Os Van
H Tang
Hans de Jong
Herman J van Eck
J Cao
J Krumsiek
J Tang
J Wu
Jan M de Boer
JE Bowers
JG Uitdewilligen
JL Bennetzen
JL Freeman
JM Boer De
JME Jacobs
JP Hamilton
JS Kim
K Arumuganathan
KJ Felcher
L Clarke
L Zamariola
LA Shearer
LK McHale
M Iovene
M Morgante
M Nei
M Zulawski
MC Schatz
MW Fiers
NM Springer
P Yu
PJ Wilson
Potato Genome Sequencing Consortium
R Velasco
RG Visser
Richard G F Visser
RK Saxena
Roeland C H J van Ham
RW Michelmore
S Brunner
S Kurtz
SK Sharma
TH Park
Theo J A Borm
Tomato Genome Consortium
TP Michael
W Zhu
X Tang
Xiaomin Tang
ZV Avramova
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref

Whole genome shotgun sequencing in a Solanum tuberosum interspecific backcross genotype

Author: Bachem Christian
Borm Theo
Cheng S.F.
Datema Erwin
de Boer Jan
Goverse Aska
Kloosterman Bjorn
Pan S.K.
van Eck Herman
van Ham Roeland
Visser Richard
Xu X.
Zhang B.
Publication venue: Wageningen University and Research Centre
Publication date: 01/01/2011

Whole genome shotgun sequencing of the heterozygous diploid interspecific hybrid backcross ((Solanum tuberosum ssp. phureja x Solanum tuberosum ssp. tuberosum) x Solanum tuberosum ssp. tuberosum) clone RH89-039-16